Testing for a Monotone Trend in a Modulated
Renewal Process

P.  A.  W.  Lewis*  and  D.  W.  Robinson* 

Abstract. In examining point processes which are overdispersed with respect
to a Poisson process, there is a problem of discriminating between trends and
the appearance in data of sequences of very long intervals. In this case the
standard "robust" methods for trend analysis based on log transforms and
regression techniques perform very poorly, and the standard exact test for a
monotone trend derived for modulated Poisson processes is not robust with
respect to its distribution theory when the underlying process is non-Poisson.
However, experience with data and an examination of the departures from the
Poisson distribution theory suggest a modification to the standard test for
trend, both for modulated renewal and general point processes. The utility of
the modified test statistic is verified by examining several sets of data, and
simulation results are given for the distribution of the test statistic for
several renewal processes.

1. Introduction. Stochastic point processes or series of events can be
described either through the sequence of times to events $\{T_i\}$, or through
the counting process $\{N_t\}$, where $N_t$ is the number of events occurring
in $(0,t]$. Trends on both serial number $i$ and on time $t$ are possible, but
we consider only time trends here, and we do not consider grouped data.

A fairly complete description of trend analysis for Poisson point processes
is given in Cox and Lewis [4], Lewis [11], Lewis [10] and Brown [2]. In these
works there is another minor difference which complicates matters: observation
may be for a fixed time interval $(0,t_0]$ or for a fixed number $n$ of events.
Fixed time observation is more common in practice but the fixed number case is
easier to simulate, so we consider both, depending on convenience. Except for
messy details the results are essentially the same.

We will also consider only the case of a simple monotone trend in time for
the process, extending the Poisson theory to the case of more general point
processes. In the case of a non-homogeneous or modulated Poisson process a
simple model [4, pp. 45] for the rate $\lambda(t)$,

(1)       $\lambda(t) = \exp\{\alpha + \beta t\} = \lambda \exp\{\beta t\}, \quad t > 0,\ \lambda > 0,$

leads to a uniformly most powerful conditional test for $\beta = 0$ against
$\beta \neq 0$ based on the statistic

          $\sum_{i=1}^{N_{t_0}} T_i .$

The conditioning is on $N_{t_0}$, the observed number of events in $(0,t_0]$,
since $N_{t_0}$ is a sufficient statistic for the nuisance parameter $\alpha$
for all $\beta$. Conditionally the statistic has mean $N_{t_0} t_0/2$ and
variance $N_{t_0} t_0^2/12$, so the statistic

(2)       $U = \frac{\left( \sum_{i=1}^{n} T_i / t_0 \right) - n/2}{(n/12)^{1/2}},$

where $n$ is the observed value of $N_{t_0}$, and which converges rapidly to a
unit normal variable under the null hypothesis, is used to test for
$\beta = 0$. The hypothesis is rejected for large or small values of $U$.
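As an illustration of (2), the following is a minimal sketch of the computation for fixed-time observation. The function name `trend_statistic_u` and the toy event times are hypothetical and are not taken from the SASE IV program; the sketch assumes only that the event times lie in $(0,t_0]$.

```python
import math


def trend_statistic_u(event_times, t0):
    """Statistic U of equation (2) for events observed in (0, t0].

    Hypothetical helper for illustration only.
    """
    n = len(event_times)
    if n == 0:
        raise ValueError("need at least one event")
    if any(t <= 0 or t > t0 for t in event_times):
        raise ValueError("event times must lie in (0, t0]")
    # Sum of times to events, scaled by the length of the observation period.
    scaled_sum = sum(event_times) / t0
    # Under a homogeneous Poisson process, given n, the scaled sum has
    # mean n/2 and variance n/12.
    return (scaled_sum - n / 2) / math.sqrt(n / 12)


# Toy data: events spread evenly, and events bunched late in the interval.
even_times = [1.0, 3.0, 5.0, 7.0, 9.0]
late_times = [6.0, 7.0, 8.0, 9.0, 10.0]
u_even = trend_statistic_u(even_times, 10.0)
u_late = trend_statistic_u(late_times, 10.0)
print(u_even, u_late)
```

Events bunched towards the end of the interval give a large positive $U$, which corresponds to an increasing rate.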

The test statistic U is computed in the SASE IV program for the analysis of
point processes [13] and the program stops if |U| > 1.96, since subsequent
analysis in the program is for stationary processes. However, most users
bypass this stop because it almost always occurs. This has led to the present
work, the supposition being that the distribution theory of U is very
sensitive to the Poisson hypothesis. Two sets of data which lead to this
program stop are discussed in the next section. Then other possible test
statistics are discussed in Section 3, and the distribution of a statistic
similar to $\sum T_i$ is examined for the special case of a Gamma renewal
process. This leads to a simple modification of the test statistic to account
for the overdispersion of the intervals between events relative to the
exponential distribution.


In  subsequent  sections  simulation  results  for  the  null  distribution  of  the 
statistic  are  given  for  other  renewal  processes.   Then  the  modification  of  the 
test  which  is  required  for  general  point  processes  is  discussed.   It  is  the 
simplicity  of  the  extension  in  this  general  case  which  makes  the  test  statistic 
attractive  when  compared  to  other  possibilities.   The  problem  of  the  power  of 
different  tests  for  trend  has  not  been  considered. 

Finally we note that the situation we are interested in is that in which the
point process is overdispersed with respect to the Poisson process. This will
be defined to be the situation in which the index of dispersion for counts
[4, pp. 71],

          $I = \lim_{t \to \infty} J(t) = \lim_{t \to \infty} \frac{\mathrm{var}(N_t)}{E(N_t)},$

is greater than one, its value for the Poisson process. For the most part this
corresponds to the marginal distribution of times between events having a
coefficient of variation

          $C(X) = \sigma(X)/E(X)$

greater than 1. This correspondence always holds for renewal processes, and
for cluster processes (see [12] and [8]).

2.   Data  Analysis.   Two  sets  of  data  are  examined  here  and  the  results  of 
tests  for  trend  based  on  U  are  discussed. 

Statistics for the first set are tabulated in Table 1. This set consists of
3 sequences of page exceptions in a multiprogrammed two-level memory computer
with demand paging [14]. There is no particularly compelling reason to expect
a monotone trend in the data, except for an initial transient. This transient
occurs because no page exception can occur until the memory is filled to the
exception levels, which are 76, 197, and 512 in the three sequences examined.
The transient is almost negligible at level 76, where the test based on U
(column 4) rejects homogeneity at a 1% level. The rejection is stronger for
the other levels, and at exception level 512 there is a very long transient
and therefore inhomogeneity.

Note however that the intervals between events are very skewed with respect
to the exponential distribution, the coefficients of variation given in
column 5 being on the order of 3, compared to 1 for an exponentially
distributed variate, and the coefficients of skewness $\gamma_1$ given in
column 6 of Table 1 being greater than the value $\gamma_1 = 2$ for the
exponential distribution.

An even more striking failure for the test occurs in the second set of data
explored in Table 2. The events are occurrences of earthquakes with energies
greater than 4.0 on the Richter scale in California and Nevada from 1932 to
1969. Six sections with equal numbers of events (except for the last) were
analyzed and their statistics are given in the first six rows of Table 2.
Columns 5 to 7 show that the intervals are very skewed, and the estimated
serial correlation coefficients $\hat\rho_1$ in column 8 show the intervals to
be correlated.

There is no particular reason to expect a monotone trend in this data, but
|U| is greater than 1.96 for all sections. The average of the U values is
-0.72 and the estimate of the standard deviation of U for the sections (the
sample standard deviation of the 6 U's) is shown in row 9, column 4 to be
$\hat\sigma = 7.82$. This is far in excess of the value of $\sigma = 1$ for
the U statistic under the hypothesis of a homogeneous Poisson process.
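The summary figures quoted for the six sections can be reproduced directly from the U values listed in column 4 of Table 2. The short sketch below uses only those six published values; the variable names are illustrative.

```python
import math
import statistics

# U values for sections 1 to 6, copied from column 4 of Table 2.
u_values = [4.4, -6.7, 9.9, 2.1, -11.7, -2.3]

mean_u = statistics.mean(u_values)
# Sample standard deviation (divisor n - 1) of the six U values.
sd_u = statistics.stdev(u_values)
# Standard error of the mean of the six values.
se_mean_u = sd_u / math.sqrt(len(u_values))

print(round(mean_u, 2), round(sd_u, 2), round(se_mean_u, 2))
```

Under a homogeneous Poisson process the sample standard deviation would be expected to be near 1.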

We  will  return  to  this  data  later  on. 

3. General remarks on the test statistic. Neither of the series considered
above can be modelled as a renewal process since the estimated first serial
correlation coefficients $\hat\rho_1$ are large. In fact the first set has
been modelled as a univariate semi-Markov process by Lewis and Shedler [14]
and the earthquake data is well known to be some kind of cluster process
(Lewis [12]; Vere-Jones [18]).

It is useful to consider renewal situations however, even if they occur
rarely in practice, because of analytical possibilities. Cox [3] has extended
the model (1) to modulated renewal processes by defining the intensity
function $\lambda(t)$ as

(3)       $\lambda(t) = z(u(t)) \exp\{\alpha + \beta t\},$


Table 1. Page exceptions in a multiprogrammed two-level memory computer with
demand paging.

Level       N_t0     t0 (page        U        C(X)    gamma_1   rho_1     U/C(X)
(# pages)            references)
---------------------------------------------------------------------------------
 76         1,807    8,802,464       -2.83    3.34    10.34     +0.188    -0.85
197           820    8,802,464       -8.67    3.27     7.14     +0.177    -2.60
512           517    8,802,464      -18.11    3.70     6.87     +0.130    -4.90

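Table 2. Earthquake data: all earthquakes with energies greater than 4.0 in
California and Nevada, 1932-1969. Entries in parentheses are as printed in the
source.

Section        N_t0    t0 (hours)   U        C(X)     gamma_1   gamma_2   rho_1     U/C(X)
------------------------------------------------------------------------------------------
1              468      72,200        4.4    1.8      5.50      42.9      +0.49      2.44
2              468      58,921       -6.7    1.65     3.67      22.4      +0.16     -4.06
3              468      49,733        9.9    1.70     2.80      12.8      +0.22      5.82
4              468      29,403        2.1    1.70     3.30      17.5      +0.14      1.23
5              468      48,061      -11.7    1.50     2.40       9.8      +0.34     -7.80
6              431      79,686       -2.3    1.25     2.40      12.6      +0.12     -1.84
------------------------------------------------------------------------------------------
Average                              -0.72   1.6      3.01      19.67      0.245    -0.702
S_xbar                              (3.19)  (0.81)   (0.68)     (4.99)    (0.059)
sigma-hat                             7.82   0.197    1.67      12.22     (0.144)
------------------------------------------------------------------------------------------
Total record   2771    338,004       -0.527  1.63     -          -         -        -0.323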

where $z(\cdot)$ is the hazard [4, pp. 135], or hazard rate in the terminology
of some workers in reliability theory. However, although a complete likelihood
can be set up [3], it has not been possible to derive any explicit tests for
$\beta = 0$ from it.

We  therefore  continue  to  examine  modifications  of  the  U  statistic.   For 
convenience,  however,  we  consider  the  case  of  observation  for  a  fixed  number  of 
events  n.   There  are  several  reasons  for  this: 

(i)   The fixed number case is much simpler to simulate and statistical
differences between the two situations will be minor, especially for large
samples.

(ii)   The sufficient statistic for $\alpha$ in the model (1) for a Poisson
process is $Y_{1n} = \sum_{i=1}^{n} X_i$, where the $X_i$ are the times
between events, and the test statistic [4, p. 52] is

(4)       $Y_{2n} = \sum_{i=1}^{n} T_i$

(5)       $\qquad = \sum_{i=1}^{n} (n+1-i) X_i .$

Although this statistic can be considered conditionally on $Y_{1n}$, it
follows from well known characterizing results for exponential and Gamma
distributed variates (see Lukacs and Laha [15]) that this is equivalent to
considering the test statistic

(6)       $Y_n = \frac{Y_{2n}}{Y_{1n}} = \frac{\sum_{i=1}^{n} (n+1-i) X_i}{\sum_{i=1}^{n} X_i} .$

Moreover for any renewal model with intensity function (3) this statistic
will be free of the nuisance parameter $\alpha$ for any $\beta$, as can be
easily shown. This is an important simplification.

(iii)  Analytical results for the fixed number case are simpler to obtain than
those for the fixed time case. Moreover (6) suggests several other
possibilities. From the form (5) for the numerator it can be seen that it is
like an empirical serial correlation between the natural numbers and the
serially ordered times between events $X_i$. This is the form of several
standard tests for trend [7, Ch. 45]. A possibility would be to replace the
$X_i$'s by exponential scores and correlate the serially ordered scores with
the index numbers $i$. Permutation tests of this sort have been discussed by
Guillier [6]; we do not pursue them here because they depend on the
independence assumption in the renewal hypothesis and we wish to consider more
general point processes with dependent times-between-events.
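The ratio statistic in (6) is straightforward to compute from the serially ordered intervals. The sketch below is illustrative only: the function names are hypothetical, and it assumes positive intervals $X_1, \ldots, X_n$ observed up to the n-th event. It also checks that the form based on times to events, as in (4), agrees with the weighted form (5), and that the ratio does not change when all intervals are rescaled.

```python
from itertools import accumulate


def y_statistic(intervals):
    """Y_n of equation (6): weighted sum (5) divided by the total Y_1n."""
    n = len(intervals)
    total = sum(intervals)
    if n == 0 or total <= 0:
        raise ValueError("need positive intervals")
    weighted = sum((n + 1 - i) * x for i, x in enumerate(intervals, start=1))
    return weighted / total


def y_from_event_times(intervals):
    """Same statistic computed as (sum of times to events) / (time of n-th event)."""
    times = list(accumulate(intervals))
    return sum(times) / times[-1]


x = [1.0, 2.0, 3.0]
y_weighted = y_statistic(x)
y_times = y_from_event_times(x)
y_rescaled = y_statistic([10.0 * v for v in x])
y_equal = y_statistic([1.0, 1.0, 1.0, 1.0])
print(y_weighted, y_times, y_rescaled, y_equal)
```

For equal intervals the statistic takes the value (n+1)/2, and multiplying every interval by a constant leaves it unchanged.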

Two  other  possible  tests  for  trend  are  noted  here. 

One is based on log transformations of the data and standard regression
techniques, but as noted in Cox and Lewis [4, pp. 41] these methods are likely
to have poor relative power for intervals $X_i$ which are more dispersed than
exponential variates. (For fairly regular processes they are likely to be the
favored procedures.)

The second possibility arises from an analogy between $Y_n$ and goodness of
fit tests. Define

(7)       $\xi_{n,i} = \frac{\sum_{j=1}^{i} X_j}{\sum_{j=1}^{n} X_j}, \quad i = 1, \ldots, (n-1).$

Then if $F_n(y)$ denotes [17] the empirical cumulative distribution function
for $\xi_{n,i}$, $i = 1, \ldots, (n-1)$, we have

(8)       $(n-1) \int_0^1 \{F_n(u) - u\}\,du = \frac{n+1}{2} - Y_n .$

Thus $Y_n$ is essentially a one sided Cramer-von Mises statistic and other
norms could be tried to measure the deviation of $F_n(u)$ from the function
$u$ between 0 and 1.
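The link between $Y_n$ and the empirical distribution function of the scaled times to events can be checked numerically. The sketch below assumes the normalization $(n-1)\int_0^1 \{F_n(u) - u\}\,du = (n+1)/2 - Y_n$, which follows from the definitions when $F_n$ is the usual empirical distribution function of the $n-1$ scaled times; the function names are hypothetical.

```python
import random
from itertools import accumulate


def y_statistic(intervals):
    """Y_n: weighted sum of intervals divided by their total."""
    n = len(intervals)
    weighted = sum((n + 1 - i) * x for i, x in enumerate(intervals, start=1))
    return weighted / sum(intervals)


def integrated_cdf_deviation(intervals):
    """Integral over (0, 1) of F_n(u) - u for the scaled times to events."""
    times = list(accumulate(intervals))
    total = times[-1]
    # Scaled times to the first n - 1 events; already in increasing order.
    xi = [t / total for t in times[:-1]]
    m = len(xi)
    grid = xi + [1.0]
    # F_n is the step function equal to j/m between the j-th and (j+1)-th points.
    area = sum((j / m) * (grid[j] - grid[j - 1]) for j in range(1, m + 1))
    return area - 0.5


rng = random.Random(2024)
sample = [rng.expovariate(1.0) for _ in range(25)]
n = len(sample)
lhs = (n - 1) * integrated_cdf_deviation(sample)
rhs = (n + 1) / 2 - y_statistic(sample)
print(lhs, rhs)
```

The two printed values agree up to rounding error for any set of positive intervals, not only exponential ones.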

Because the statistic $Y_n$ and tests for trend based on it can be extended
to non-renewal processes, we consider its distribution first for Gamma renewal
processes, then for several other renewal processes and then for cluster
processes.

4. Testing in modulated Gamma renewal processes. The Gamma renewal process
has independently distributed intervals with probability density function
[4, pp. 136]

(9)       $f_X(x) = \left( \frac{k}{\mu} \right)^{k} \frac{x^{k-1} e^{-kx/\mu}}{\Gamma(k)}, \quad x > 0,\ k > 0,$

where $\Gamma(k)$ is the complete Gamma function. For $k = 1$ we have an
exponentially distributed variate, and for $k = 1/2$ the square of a normal
random variable. We will be concerned with the case $k < 1$. We also have

(10)      $E(X) = \mu; \quad \mathrm{var}(X) = \frac{\mu^2}{k}; \quad C(X) = \frac{1}{\sqrt{k}} .$


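Consider now the distribution of $Y_n$ given by (6), which we write for
convenience as

(11)      $Y_n = \frac{\sum_{i=1}^{n} (n+1-i) X_i / n}{\sum_{i=1}^{n} X_i / n} = \frac{Y'_{2n}}{Y'_{1n}} .$

The moments of the numerator and denominator are

(12)      $E(Y'_{1n}) = \mu, \quad \mathrm{var}(Y'_{1n}) = \sigma^2/n,$

(13)      $E(Y'_{2n}) = \mu(n+1)/2, \quad \mathrm{var}(Y'_{2n}) = (n+1)(2n+1)\sigma^2/(6n),$

where $\sigma^2 = \mathrm{var}(X)$.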

Now it is a characterizing property of Gamma distributed variates
[15, p. 58] that the expected value of a ratio of linear functions of the
Gamma variates such as that appearing in (11) is the ratio of the expected
values, and similarly for the second moment. Thus we have, for Gamma renewal
processes,

(14)      $E(Y_n) = (n+1)/2;$

(15)      $\mathrm{var}(Y_n) = \frac{(n-1)(n+1)}{12(kn+1)} = \frac{(n-1)}{12} \cdot \frac{(n+1)}{(n/C^2(X) + 1)};$

(16)      $\mathrm{var}(Y_n) \sim \frac{n-1}{12}\, C^2(X)$ for large $n$.

Since $C^2(X)$ equals one for a Poisson process ($k = 1$), this checks with
results for the statistic U given in (2).
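The finite-sample variance for the Gamma case can be evaluated directly. The sketch below assumes the closed form $\mathrm{var}(Y_n) = (n-1)(n+1)/\{12(kn+1)\}$ and its large-sample limit $(n-1)C^2(X)/12$, and evaluates the standard deviation at $k = 0.1$ for the sample sizes used in Table 3; the function names are illustrative.

```python
import math


def var_y_gamma(n, k):
    """Exact null variance of Y_n for Gamma intervals with shape parameter k."""
    return (n - 1) * (n + 1) / (12 * (k * n + 1))


def var_y_large_sample(n, c2):
    """Large-sample variance, with c2 the squared coefficient of variation."""
    return (n - 1) / 12 * c2


# Standard deviations for k = 0.1, i.e. C^2(X) = 10.
sd_table = {n: round(math.sqrt(var_y_gamma(n, 0.1)), 3) for n in (10, 30, 50, 100)}
print(sd_table)

# For k = 1 (Poisson process) the exact variance is (n - 1)/12 for every n.
poisson_check = var_y_gamma(40, 1.0)
print(poisson_check, var_y_large_sample(40, 1.0))
```

The values for $k = 0.1$ agree with the standard deviations listed in the last row of Table 3.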
Note further that

          $Y'_n = Y_n - \frac{n+1}{2}$

(17)      $\quad = \frac{\sum_{i=1}^{n} \left\{ \frac{n+1-i}{n} - \frac{n+1}{2n} \right\} X_i}{\sum_{i=1}^{n} X_i / n}$

(18)      $\quad = \frac{\sum_{i=1}^{[n/2]} a_i X'_i}{\sum_{i=1}^{n} X_i / n},$

where $[n/2]$ is the greatest integer less than or equal to $n/2$,
$X'_i = X_i - X_{n+1-i}$ is a symmetric random variable and
$a_i = (n+1-2i)/(2n)$ is an odd sequence.

Using (18) we can show the following results:

(i)   The centered statistic $Y'_n$ has odd moments which are all zero. This
follows because the numerator in (18) is a sum of independent symmetric random
variables and is therefore [5, Lemma 2, p. 149] itself symmetric. This implies
that the odd moments of the numerator (including the first) are zero and, by
the Lukacs and Laha result cited above, so are those of $Y'_n$. Thus $Y'_n$ is
a symmetric random variable.

(ii)   The numerator in (18) divided by $n^{1/2}$ is asymptotically normal.
Moreover since the denominator converges with probability one to $\mu$, which
is non-zero, results from Billingsley [1, Corollary 2, p. 31] show that the
reciprocal of the denominator converges with probability one to $1/\mu$.
Slutsky's Theorem (see Billingsley [1]) then says that

(19)      $\left\{ \frac{12}{(n-1)\, C^2(X)} \right\}^{1/2} Y'_n \;\xrightarrow{L}\; N(0,1).$

(iii)   Convergence to the normal distribution is likely to be very rapid
because of the symmetry of the distribution of $Y'_n$.

To examine the small sample distribution of $Y'_n$ for the Gamma renewal case
an extensive simulation was undertaken. Detailed results are given in Robinson
[16]. The results are illustrated in Table 3, which is extracted from
Robinson [16].

The simulations involved 100,000 replications using the random number
generator LLRANDOM (Learmonth and Lewis [9]) and a Gamma random number
generator developed by Robinson [16]. The computations were checked by
comparing the theoretical results for the mean and variance of the statistics
with the simulated mean and variance.
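A much smaller version of this kind of check can be sketched as follows. It is not the original simulation: it substitutes Python's built-in generator and `random.gammavariate` for LLRANDOM and the Gamma generator of [16], uses 20,000 rather than 100,000 replications, and the function names, seed and unit mean are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def y_statistic(intervals):
    """Y_n: weighted sum of intervals divided by their total."""
    n = len(intervals)
    weighted = sum((n + 1 - i) * x for i, x in enumerate(intervals, start=1))
    return weighted / sum(intervals)


def simulate_y(n, k, mean=1.0, replications=20000, seed=12345):
    """Simulate Y_n under the null hypothesis for Gamma intervals with shape k."""
    rng = random.Random(seed)
    scale = mean / k  # gammavariate(shape, scale) has mean shape * scale
    return [
        y_statistic([rng.gammavariate(k, scale) for _ in range(n)])
        for _ in range(replications)
    ]


n, k = 10, 0.1
samples = simulate_y(n, k)
sim_mean = statistics.mean(samples)
sim_sd = statistics.stdev(samples)
theory_mean = (n + 1) / 2
theory_sd = math.sqrt((n - 1) * (n + 1) / (12 * (k * n + 1)))
print(sim_mean, theory_mean, sim_sd, theory_sd)
```

The simulated mean and standard deviation should lie close to the theoretical values (n+1)/2 and the square root of the Gamma-case variance, up to Monte Carlo error.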

Only the case $k = 0.1$ ($C^2(X) = 10$) is given in Table 3 because this was
the most extreme case simulated and has the greatest departure from normality
and the slowest convergence to the asymptotic normal form. Simulated quantiles
of $Y_n$, normalized by subtracting the mean (14) and dividing by the square
root of the variance (15) (these are listed in the last two rows of the
table), are shown in Table 3. Because of the symmetry of the distribution,
only the lower quantiles corresponding to levels $\alpha$ = 0.001, 0.002,
0.005, 0.010, 0.020, 0.025, 0.050, 0.100, 0.200, 0.300, 0.400, 0.500 are
given. They are actually the average of the simulated upper and lower
quantiles and have a standard deviation of approximately 0.001.

The distribution can be seen to be a little more peaked than a normal
distribution, with shorter tails, but even by n=50 a normal approximation to
the distribution of $Y'_n$ is adequate for purposes of hypothesis testing.

Table 3. Simulation results for the statistic $Y_n$ for Gamma distributed
intervals with k=0.10 under the null hypothesis of no trend ($\beta = 0$).
Quantiles of $Y_n$ are normalized by subtracting $E(Y_n)$ and dividing by
$\sigma(Y_n)$.

alpha        n=10      n=30      n=50      n=100     Normal quantile
--------------------------------------------------------------------
0.001       -2.202    -2.740    -2.915    -3.001    -3.090
0.002       -2.191    -2.607    -2.750    -2.812    -2.878
0.005       -2.148    -2.460    -2.500    -2.545    -2.576
0.010       -2.078    -2.231    -2.290    -2.313    -2.326
0.020       -1.944    -2.014    -2.049    -2.054    -2.054
0.025       -1.875    -1.935    -1.960    -1.965    -1.960
0.050       -1.654    -1.665    -1.665    -1.656    -1.645
0.100       -1.343    -1.320    -1.307    -1.297    -1.282
0.200       -0.924    -0.881    -0.871    -0.856    -0.842
0.300       -0.591    -0.554    -0.549    -0.537    -0.524
0.400       -0.279    -0.272    -0.265    -0.261    -0.253
0.500       -0.001    -0.005     0.002    -0.003     0.000
--------------------------------------------------------------------
E(Y_n)       5.5      15.5      25.5      50.5
sigma(Y_n)   2.031     4.328     5.891     8.703

The proposal for testing a monotone trend in a Gamma renewal process derived
from these results is to estimate the coefficient of variation from the data
and test for $\beta = 0$ using

          $\left\{ \frac{12}{n-1} \right\}^{1/2} \frac{Y'_n}{\hat{C}(X)},$

assuming that it has a unit normal distribution. This essentially uses the
Poisson test statistic divided by $\hat{C}(X)$. This modified statistic is
given in the last columns of Tables 1 and 2. The test results are more in line
with expectations, but still do not reflect inflation of the variance of U
because of correlation between intervals between events. This is discussed in
Section 6.
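A minimal sketch of the modified statistic for observation up to a fixed number of events is given below. The function name is hypothetical, and estimating the coefficient of variation as the sample standard deviation divided by the sample mean of the intervals is an assumption made here for illustration; the text does not prescribe a particular estimator.

```python
import math
import statistics


def modified_trend_statistic(intervals):
    """Centered Y_n scaled by {12/(n-1)}^(1/2) and an estimated C(X)."""
    n = len(intervals)
    if n < 3:
        raise ValueError("need at least three intervals")
    total = sum(intervals)
    weighted = sum((n + 1 - i) * x for i, x in enumerate(intervals, start=1))
    y_centered = weighted / total - (n + 1) / 2
    # Assumed estimator of the coefficient of variation of the intervals.
    c_hat = statistics.stdev(intervals) / statistics.mean(intervals)
    return math.sqrt(12 / (n - 1)) * y_centered / c_hat


# Toy data: intervals that shorten steadily (events become more frequent).
shortening = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
stat_forward = modified_trend_statistic(shortening)
stat_reversed = modified_trend_statistic(shortening[::-1])
print(stat_forward, stat_reversed)
```

Reversing the order of the intervals changes only the sign of the statistic, and the value is referred to a unit normal distribution.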

5. Distributional results for other renewal cases. The result (14) holds
for any stationary sequence $X_1, \ldots, X_n$, including a renewal (i.i.d.)
sequence. This is because

          $E\left\{ \frac{X_1 + \cdots + X_n}{X_1 + \cdots + X_n} \right\} = n\, E\left\{ \frac{X_i}{X_1 + \cdots + X_n} \right\} = 1,$

or $E\{X_i/(X_1 + \cdots + X_n)\} = 1/n$ for $i = 1, \ldots, n$. Taking
expectations in (6) and using the form (5) for $Y_{2n}$ yields

          $E(Y_n) = \frac{n+1}{2} .$

This result merely says that $Y_n$, which is a normalized centroid of times to
events in an interval stationary point process, always has the expected value
(n+1)/2.

Thus the centering in (17) is correct for all sequences and we discuss $Y'_n$
from here on.

Another useful result is that $Y'_n$ is a symmetric random variable for any
renewal sequence. To see this note that $-Y'_n$ can be written exactly in the
form (18) with $X'_i = X_{n+1-i} - X_i$, but since these are symmetric random
variables and the $X_i$'s are independent, the functional form for $-Y'_n$ is
exactly the same as that for $Y'_n$. Thus they have the same distribution and
thus $Y'_n$ is a symmetric random variable. All odd moments are thus zero. In
addition, by arguments of the previous section, $Y'_n$ is asymptotically
normal with variance (16) if $\mathrm{var}(X) < \infty$ for any renewal
process.

To explore the small sample distribution of $Y'_n$ further for renewal
processes using simulation we chose two other density functions for the
intervals.

The first is the Weibull density function

(20)      $f_X(x) = k \delta^{k} x^{k-1} \exp(-\delta^{k} x^{k}), \quad \delta > 0,\ k > 0,\ x > 0,$

which reduces to the exponential for $k = 1$. In the simulation the parameters
were chosen so that the means and coefficients of variation of the intervals
$X_i$ were the same as for the Gamma cases.

The second density function chosen was the log-normal density, again with
parameters chosen to match the means and coefficients of variation in the
Gamma cases. Note that both these densities are, for given coefficient of
variation, more skewed than the Gamma density, the log-normal more so than the
Weibull. In addition both have hazard functions which approach zero as
$x \to \infty$, in contrast to the Gamma density which has an exponential
tail.
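Matching the mean and coefficient of variation can be sketched as follows. The text does not say how the matching was carried out; the sketch uses the standard moment formulas for a Weibull density of the form $k\delta^k x^{k-1}\exp\{-(\delta x)^k\}$ and for the log-normal density, solves for the Weibull shape by bisection, and uses hypothetical function names throughout.

```python
import math


def weibull_cv2(k):
    """Squared coefficient of variation of a Weibull variate with shape k."""
    return math.exp(math.lgamma(1 + 2 / k) - 2 * math.lgamma(1 + 1 / k)) - 1


def weibull_shape_for_cv2(c2, lo=0.05, hi=50.0):
    """Bisection for the shape k; the squared CV decreases as k increases."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if weibull_cv2(mid) > c2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def weibull_rate_for_mean(mean, k):
    """Parameter delta giving the requested mean, E(X) = Gamma(1 + 1/k)/delta."""
    return math.gamma(1 + 1 / k) / mean


def lognormal_params(mean, c2):
    """Mean and standard deviation of log X for a log-normal X."""
    s2 = math.log(1 + c2)
    return math.log(mean) - s2 / 2, math.sqrt(s2)


k_w = weibull_shape_for_cv2(10.0)
delta_w = weibull_rate_for_mean(1.0, k_w)
m_ln, s_ln = lognormal_params(1.0, 10.0)
print(k_w, delta_w, m_ln, s_ln)
```

Here the target is unit mean and $C^2(X) = 10$, the most extreme case considered in the simulations.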

It is possible to compute $\mathrm{var}(Y_n)$ for finite n in both these
cases, but the results are messy. In general the variances are smaller than
for the Gamma case; when $C^2(X) = 10.0$ and n = 50, the standard deviations
of $Y_n$ are 5.891, 5.182 and 4.355 for the Gamma, Weibull and log-normal
cases respectively (see the last rows of Table 3 and the simulated values in
Tables 4 and 5).

Only the worst case of the simulations for the Weibull and log-normal
intervals, i.e., those matching the Gamma case with $C^2(X) = 10.0$, are
given, in Tables 4 and 5 respectively. Again 100,000 replications were used.

The normalized quantiles show distributions for $Y'_n$ at n = 10, 30, 50, 100
for both densities and, in addition, for n = 200 for the log-normal case. In
both cases the distributions have heavier tails than in the Gamma case, and
estimated kurtoses $\gamma_2$ greater than one. The convergence to the
asymptotic normal distribution is particularly slow for the log-normal case,
but in no case is the normal approximation too far off at the quantiles
corresponding to the usual significance levels used in hypothesis testing.
Actually, dividing by $C(X)\{(n-1)/12\}^{1/2}$ from (16) provides a better
normal approximation than does dividing by the true standard deviation of
$Y'_n$.

Table 4. Simulation results for the statistic $Y_n$ for Weibull distributed
intervals with $C^2(X) = 10.0$ under the null hypothesis of no trend
($\beta = 0$). Quantiles of $Y_n$ are normalized by subtracting the estimated
mean of $Y_n$ and dividing by the estimated standard deviation of $Y_n$.

alpha              n=10      n=30      n=50      n=100     Normal quantile
--------------------------------------------------------------------------
.001              -2.533    -2.922    -3.067    -3.214    -3.090
.002              -2.473    -2.772    -2.845    -2.973    -2.878
.005              -2.343    -2.521    -2.570    -2.635    -2.576
.010              -2.188    -2.301    -2.326    -2.373    -2.326
.020              -1.987    -2.042    -2.052    -2.069    -2.054
.025              -1.920    -1.954    -1.960    -1.971    -1.960
.050              -1.659    -1.652    -1.644    -1.641    -1.645
.100              -1.324    -1.294    -1.280    -1.272    -1.282
.200              -0.883    -0.850    -0.845    -0.831    -0.842
.300              -0.557    -0.531    -0.528    -0.516    -0.524
.400              -0.271    -0.255    -0.259    -0.249    -0.253
.500              -0.002    -0.000     0.000    -0.002     0.000
--------------------------------------------------------------------------
Est. mean          5.500    15.490    25.527    50.495
Est. std. dev.     1.678     3.703     5.182     7.953
Est. gamma_1      -0.002    -0.001    -0.001    -0.001
Est. gamma_2       2.52      2.86      2.96      3.14

Table 5. Simulation results for the statistic $Y_n$ for log-normal distributed
intervals with $C^2(X) = 10.0$ under the null hypothesis of no trend
($\beta = 0$). Quantiles of $Y_n$ are normalized by subtracting the estimated
mean of $Y_n$ and dividing by the estimated standard deviation of $Y_n$.

alpha              n=10      n=30      n=50      n=100     n=200     Normal quantile
------------------------------------------------------------------------------------
.001              -2.941    -3.342    -3.452    -3.692    -3.831    -3.090
.002              -2.805    -3.084    -3.167    -3.361    -3.411    -2.878
.005              -2.550    -2.725    -2.775    -2.845    -2.868    -2.576
.010              -2.325    -2.434    -2.445    -2.471    -2.472    -2.326
.020              -2.073    -2.104    -2.098    -2.106    -2.094    -2.054
.025              -1.975    -1.997    -1.991    -1.988    -1.978    -1.960
.050              -1.656    -1.638    -1.629    -1.621    -1.606    -1.645
.100              -1.289    -1.252    -1.246    -1.230    -1.224    -1.282
.200              -0.843    -0.813    -0.804    -0.791    -0.786    -0.842
.300              -0.524    -0.502    -0.495    -0.487    -0.484    -0.524
.400              -0.255    -0.241    -0.236    -0.236    -0.231    -0.253
.500               0.001     0.000     0.003    -0.003     0.003     0.000
------------------------------------------------------------------------------------
Est. mean          5.501    15.484    25.518    50.492   100.463
Est. std. dev.     1.365     3.059     4.355     6.889    10.699
Est. gamma_1      -0.004     0.004    -0.015    -0.002    -0.001
Est. gamma_2       2.89      3.35      3.51      3.87      4.11

Convergence is of course faster and the normal approximation better for the
cases not shown here, i.e. for intervals with coefficients of variation
approaching the value one of the exponential distribution. Note that
$C^2(X) = 10.0$ approximates the values found for the computer data of
Table 1.

6. Distributional results for general point processes. The finding from the
previous sections was that for renewal sequences the null hypothesis variance
of $Y'_n$ is inflated by approximately $C^2(X)$ over its value for a Poisson
process. The approximation is exact for large n.

However, in both examples cited in Section 2 the intervals between events
$X_i$ are correlated (see the values $\hat\rho_1$ in Tables 1 and 2). It turns
out that for a simple statistic such as $Y'_n$ fairly broad results can be
obtained for general point processes, the modification to the variance of
$Y'_n$ again being simple to compute from the data. Thus a rough test of trend
can be performed.

Details of the derivation will be given elsewhere. For a broad class of
situations $Y'_n$ is asymptotically normally distributed with variance

(21)      $\mathrm{var}(Y'_n) \sim \frac{(n-1)}{12} \{\pi\, C^2(X)\, f_+(0+)\},$

where $f_+(0+)$ is the initial point on the spectrum of the intervals
$\{X_i\}$ of the process. Since $f_+(0+)$ is related to the initial point of
the spectrum of counts, $g_+(0+)$, and the asymptotic slope, $V'(\infty)$, of
the variance time curve, $\mathrm{var}\{N_t\}$, of the point process by the
relationship [4, p. 78]

(22)      $V'(\infty) = \pi\, g_+(0+) = \frac{\pi\, C^2(X)}{E(X)}\, f_+(0+),$

we can write (21) as

(23)      $\mathrm{var}(Y'_n) \sim \frac{(n-1)}{12} \{E(X)\, V'(\infty)\}.$

The quantity $V'(\infty)$ is simple to estimate from the data
[4, pp. 115-120], thereby providing an easy modification for the test
statistic $Y'_n$.

For a renewal process, $f_+(0+) = 1/\pi$, and (21) reduces to (16). Poisson
cluster processes [12, 18] have been used to model the earthquake data of
Section 2. If the length of the cluster in the cluster process is denoted by
S, we have

(24)      $\mathrm{var}(Y'_n) \sim \frac{(n-1)}{12}\, E(S+1)\{1 + C^2(S+1)\},$

where $C^2(S+1)$ is the coefficient of variation squared of S+1. When there is
no cluster, i.e. S=0 with probability 1, the result (24) reduces to that for
the Poisson process.

For the earthquake data, which has long and very variable clusters, the
multiplier of (n-1)/12 in (24) has an estimated value of approximately 49.0.
Dividing the U values given in column 4 of Table 2 by $(49)^{1/2} = 7.0$, we
obtain a test statistic which accepts the hypothesis of no trend in all 6
sections of the data.
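The arithmetic of this adjustment can be sketched as follows, using the six U values from column 4 of Table 2 and the estimated multiplier of approximately 49 quoted above. The helper `cluster_multiplier` simply evaluates $E(S+1)\{1 + C^2(S+1)\}$ from (24); its name and arguments are illustrative.

```python
import math


def cluster_multiplier(mean_size, cv2_size):
    """Variance multiplier E(S+1){1 + C^2(S+1)} for a Poisson cluster process."""
    return mean_size * (1 + cv2_size)


# U values for the six sections, from column 4 of Table 2.
u_values = {1: 4.4, 2: -6.7, 3: 9.9, 4: 2.1, 5: -11.7, 6: -2.3}

multiplier = 49.0  # estimated value quoted in the text
scale = math.sqrt(multiplier)
adjusted = {section: u / scale for section, u in u_values.items()}
largest = max(abs(v) for v in adjusted.values())

for section, value in adjusted.items():
    print(section, round(value, 2))
print(largest < 1.96)

# With no clustering (S = 0 with probability one) the multiplier is 1.
no_cluster = cluster_multiplier(1.0, 0.0)
```

The largest adjusted value in absolute terms comes from section 5 and is about 1.67, below the two-sided 5% point of 1.96.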

7. Conclusions and further work. The recommendation put forward in this
paper is to test for trend in a point process using the U statistic (2)
divided by the estimated coefficient of variation C(X) in a renewal process,
or by an estimate of $\{E(X)\, V'(\infty)\}^{1/2}$ in (23) for a general point
process.

The  test  is  not  proposed  as  being  in  any  sense  optimal,  but  because  it  can 
be  used  without  detailed  knowledge  of  the  structure  of  the  process  it  is  very 
functional.   It  would  be  nearly  optimal  if  the  point  process  were  close  to  a 
Poisson  process. 

The power of the test needs to be investigated so that its utility can be
assessed relative to other tests, especially for processes which are highly
overdispersed relative to the Poisson process. Point processes of that type
occur in many applications.

Other tests to be considered could be standard regression tests after a log
transform or scoring of the intervals in the data; rank correlation tests
using, perhaps, exponential scores for the intervals; and other functionals
than that given in (8) for measuring the "distance" of $F_n(u)$ from $u$ (see
[4, Ch. 6]). There are other possibilities explored in a recent thesis by
Guillier [6].